05 - Raster manipulation
REMEMBER: delete eval=FALSE to knit!!
#install.packages("knitr")
#install.packages("rmdformats")
#install.packages("formatR") #needs this to be able to knit, apparently
library(sf)## Warning: pakke 'sf' blev bygget under R version 4.1.2
## Linking to GEOS 3.9.1, GDAL 3.2.1, PROJ 7.2.1; sf_use_s2() is TRUE
library(knitr)## Warning: pakke 'knitr' blev bygget under R version 4.1.2
library(rmdformats)## Warning: pakke 'rmdformats' blev bygget under R version 4.1.2
library(formatR)## Warning: pakke 'formatR' blev bygget under R version 4.1.2
## Global options
options(max.print="75")
opts_chunk$set(echo=TRUE,
cache=TRUE,
prompt=FALSE,
tidy=TRUE,
comment=NA,
message=FALSE,
warning=FALSE)
opts_knit$set(width=75)Welcome to Bulgaria
Throughout these exercises, you will be working with two different types of satellite imagery for the Kazanlak Valley in central Bulgaria. The images are provided by the JICA and GeoEye Foundation respectively:
- Aster contains the elevation values for the area.
- IKONOS image is a 4-band satellite image covering ca 150 sq km of
the Kazanlak Valley with red, green, blue and infrared band, captured in
2001. As it is a fairly large orthophoto, I divided it into two halves
with each half at 2 GB and placed them in ScienceData. You can download
the full-resolution images manually from public www.sciencedata.dk
folder, or directly with with
file.download()using these direct links for West and East respectively. In exercise 6 you will play with a slightly reduced East image, which is provided as part of Github data folder.
You will learn to work with these rasters and plot vector data over them.
Task 1: Access raster data values
Raster data can be very big depending on the extent and resolution
(grid size). In order to deal with this the raster() and
brick() functions are designed to only read in the actual
raster values as needed. To show that this is true, you can use the
inMemory() function on an object and it will return
FALSE if the values are not in memory. If you use the
head() function, the raster package will read
in only the values needed, not the full set of values. The raster values
will be read in by default if you perform spatial analysis operations
that require it or you can read in the values from a raster manually
with the function getValues().
Instructions
- Activate
rasterandrgdallibrary - Use
GDALinfo()to inspect the properties of the Aster image raster in the data folder (the file name is “Aster.tif”). What can you learn from this inspection about its bands, resolution, and projection? - Read in the Aster image. Review Week 02 raster loading if unsure about how.
- Use the
inMemory()function on the elevation object to determine if the data has been read in. - Use the
head()function to look at the first few values from the elevation raster. - Use the
getValues()function on the elevation object to read in all the data. - Use the
hist()function to create a quick histogram of the elevation values. Note the pile of values near -9999, these should beNA(any idea why?) and we will address this later.
# Library install.packages('raster') install.packages('rgdal')
library(raster)
library(rgdal)
# Read in the elevation layer
elevation <- raster("data/Aster.tif")
# Check if the data is in memory
inMemory(elevation)[1] FALSE
# Use head() to peak at the first few records
head(elevation) 1 2 3 4 5 6 7 8 9 10 11 12
1 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
2 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
3 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
13 14 15 16 17 18 19 20
1 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
2 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
3 -9999 -9999 -9999 -9999 -9999 -9999 -9999 -9999
[ reached getOption("max.print") -- omitted 7 rows ]
# Use getValues() to read the values into a vector
vals <- getValues(elevation)
# Use hist() to create a histogram of the values
hist(vals)Congratulations! You now know that the raster package
only reads in raster values as needed to save space in memory. You can
get the values manually using the getValues() function.
Now, a new question arises, why are so many values encoded as -9999? Are we in the Mariana Trench all of sudden?
Task 2: Change values: handle missing or bad data values in rasters
There are many situations where you might need to recode raster
values. You may want to change the outlier values to NA for
example. In the raster package, reclassification is
performed with the reclassify() function.
In the elevation raster you’ve worked with the values
are meters above sea level and are supposed to range between 0 and 2500.
Anything below 0 should be an NA. In this exercise you will
assign any values below 0 to NA.
Instructions
- Check that the package
rasterand the objectelevationare loaded in your workspace. - Plot the
elevationraster usingplot(). - Set up a three-column matrix with the
cbind()function and values -10000, 0, NA. - Use the matrix and
reclassify()to reclassify values below 0 toNA. You will need to use the argumentrcl. - Plot the reclassified elevation layer to confirm there are no values below 0.
# Plot the elevation layer to see the legend values
plot(elevation)# Set up the matrix
vals <- cbind(-10000, 0, NA)
# Reclassify
elevation_reclass <- reclassify(elevation, rcl = vals)
# Plot again and confirm that the legend range is 0 - 2400
plot(elevation_reclass)Good work! Knowing how to reclassify rasters will come in handy. When
you get a chance you should review the help for
reclassify() particularly the part that discusses how to
specify the rcl argument. The three-column approach from
this exercise is most common but there are other approaches.
Task 3: Limit rasters to focus areas
Mask and crop are similar operations that allow you to limit your
raster to a specific area of interest. With mask() you
essentially place your area of interest on top of the raster and any
raster cells outside of the boundary are assigned NA
values. With crop() you limit the extent of your raster to
that of your focus area.
The Aster image covers a large area, but we are primarily interested in the areas surveyed by archaeologists, the first of which is the cluster of burial mound points and second, large survey polygons registered by local archaeologists during field survey. In this exercise you will use the vectors of mounds and survey units to crop and mask the elevation raster.
Note: In the past the raster package did not support
sf objects. This should not happen now, but if you
encounter difficulty with raster:vector interactions, it helps to know
that you can convert the vector to Spatial object with, for
example, as(input, "Spatial").
Instructions I
- Load
sflibrary - Create
moundsobject from the shapefile “KAZ_mounds.shp”. For a refresher on vector loading, check out Week 02 instructions. - Verify that
moundsCRS matches theelevationand fix withst_transform()if not. - Create a bounding box around the mounds with
st_make_grid()following Week 03 guidelines. - Plot the
mounds_bband themoundsto see how these two objects relate. - Crop the
elevationlayer by themoundsobject and create a smallerelevobject - Plot the
elevandmoundsand themounds_bbtogether.
# Load sf library
library(sf)
# Create mounds object from the shapefile “KAZ_mounds.shp”.
mounds <- st_read("data/KAZ_mounds.shp")Reading layer `KAZ_mounds' from data source
`C:\Users\sofie\Desktop\cds-spatial w5\Week05\data\KAZ_mounds.shp'
using driver `ESRI Shapefile'
Simple feature collection with 773 features and 5 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 352481.3 ymin: 4712325 xmax: 371282.4 ymax: 4730029
Projected CRS: WGS 84 / UTM zone 35N
# Verify that mounds CRS matches the elevation and fix with st_transform() if
# not.
st_crs(mounds) == st_crs(elevation)[1] TRUE
# Create a bounding box around the mounds with st_make_grid() following Week 03
# guidelines.
mounds_bb <- st_make_grid(mounds, n = 1)
# Plot the mounds_bb and the mounds to see how these two objects relate.
plot(mounds_bb)plot(mounds)# Crop the elevation layer by the mounds object and create a smaller elev
# object
elev <- crop(elevation, mounds)
# Plot the elev and mounds and the mounds_bb together.
plot(elev)
plot(st_geometry(mounds_bb), add = TRUE)
plot(st_geometry(mounds), add = TRUE)Instructions II
- Create
surveyobject from shapefile called “KAZ_units.shp” - Project the
surveyobject to match the newelevraster withst_transform()if needed. - Compute the area of the survey with
st_area()and save this object asareas. What units are these? - Filter the survey units to only those above 30000 square meters with
the
filter()function. You will need to wrapareasinunclass(). Save assurvey_big. Remember to have the tidyverse or dplyr library attached forfilter()to work properly. Also sf might interfere so specify the dplyr::filter() if needed.
# Read in the survey object
survey <- st_read("data/KAZ_units.shp")Reading layer `KAZ_units' from data source
`C:\Users\sofie\Desktop\cds-spatial w5\Week05\data\KAZ_units.shp'
using driver `ESRI Shapefile'
Simple feature collection with 7708 features and 25 fields (with 1 geometry empty)
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 352181.7 ymin: 4712923 xmax: 370892.1 ymax: 4729960
Projected CRS: WGS 84 / UTM zone 35N
# install.packages('tidyverse')
library(tidyverse) #activate package (needed for the slice function later etc.)
# Compute the area of the survey
areas <- st_area(survey)
# Filter to survey with areas > 30000
survey_big <- dplyr::filter(survey, unclass(areas) > 30000)Instructions III
- Review the plot of
elevraster. - Plot the geometry of the
survey_bigover it. - Mask the
elevlayer withsurvey_bigand save aselevation_mask. This may take a couple of seconds. - Review the plot of
elevation_mask.
# Plot the elevation raster
plot(elev)# Plot the geometry of survey_big
plot(st_geometry(survey_big))# Mask the elev layer with survey_big and save as elevation_mask
elevation_mask <- mask(elev, mask = survey_big)
# Plot elevation_mask -- this is a raster!
plot(elevation_mask)Nice! You ensured that layers had the same CRS, you cropped raster,
computed bounding boxes and areas for masking and filtered the data.
Finally, you used mask() to mask the elevation raster to
the large survey units.
Question:
1. What extent does the elevation raster default to after cropping by mounds vs by masking by the large polygons of survey_big?
2. Why do we not use the mounds bounding box to crop the elevation raster?
Task 4: Reduce the raster grid cell size using aggregate
Rasters, such as ortophotos, terrain models, or aggregated rasters for large area, often come is resolutions far greater than you need. Up till now you have played with fairly small, processed rasters. Now you are getting a taste of the real thing. To reduce the computational load when running analyses, or when developing the right approach, you might wish to use a reduced resolution raster.
The function to reduce resolution in rasters is
aggregate() which, as you might guess, aggregates grid
cells into larger grid cells using a user-defined function (for example,
mean or max). The function used to aggregate the values is determined by
the fun argument (the default being mean) and the amount of
aggregation is driven by the fact argument (the default
being 2).
Instructions
- Load the East half of the two IKONOS images of the Kazanlak Valley from (“KazE.tif”).
- Plot the file you read in with the appropriate function for a multi-band raster.
- Determine the raster resolution using
res()and number of raster cells in the layer withncell(). - Aggregate the IKONOS image using the default for fun and a factor of
10 and save the new raster to
east_small. - Plot the new raster with
plot()for comparison to the old version. - Determine the new raster resolution and the number of raster cells.
# Load the IKONOS raster
east <- raster("data/KazE.tif")
# Plot the IKONOS raster
plot(east)# Determine the raster resolution
res(east)[1] 10 10
# Determine the number of cells
ncell(east)[1] 2374596
# Aggregate the raster
east_small <- aggregate(east, fact = 10)
# Plot the new east layer
plot(east_small)# Determine the new raster resolution
res(east_small)[1] 100 100
# Determine the number of cells in the new raster
ncell(east_small)[1] 23940
Lovely job! In this example you read in a raster and then converted it to a lower resolution raster to save on the size of the object and ultimately computation power required. In this example, the raster was not too big to begin with so perhaps aggregating would not be necessary but for big rasters such as you will see next week, this can be a big help and a necessity.
Task 5: Extract raster values by location
Beyond simply masking and cropping you may want to know the actual
cell values at locations of interest. You might, for example, want to
know the elevation at each mound location or perhaps the mean elevation
within the large survey units. This is where the extract()
function comes in handy.
Usefully, and you’ll see this in a later analysis, you can feed
extract() a function that will get applied to extracted
cells. For example, you can use extract() to extract raster
values by point, line, polygon, or neighborhood and with the
fun = mean argument it will return an average cell value by
neighborhood.
Instructions
- Ensure mounds are still in memory.
- Use the
rasterfunctionextract()to determine the elevation at each of the mounds. Assign the extracted value into aelevcolumn in the mounds object. Beware that the extract() function exists across multiple packages, so it’s wise to use raster:: in front of it. - Look at the
moundselevation column through the plot() function as well as histogram. Do theextract()results make sense? Theelevationlayer values represent meters above sea level.
# Extract the elevation values at the sites
mounds$elev <- raster::extract(elevation, mounds)
# Look at the sites and extraction results
plot(mounds$elev) #changing Adela's scripthist(mounds$elev)Great! raster::extract() is a very useful tool. It can
be used with polygons as well as points, and the result can be written
out as a separate vector or added to an existing object as a column.
Task 6: Raster math with overlay
You will now use the elevation layer and an “prominence”
layer. Prominence measures what percentage of surrounding cells are
visible from any given location in the raster. A high percentage value
thus indicates a prominent point, while a low percentage indicates a
low-lying location with poor inter-visibility. Archaeologists assert
that mounds had a good field of visibility, so let’s see whether they
are mostly right :)
What you will do in this exercise is essentially identify the most
prominent locations among the registered archaeological mounds by
finding areas that have both a high percentage of prominence and high
elevation. You will use two rasters and you will define an overlay
function f to do the raster math with. Note that you can
only do raster math on two rasters of the same extent, so make sure you
align the elevation and prominence accordingly with crop()
function
Instructions
- Make sure elevation object exists in memory (“Aster.tif”, it is a single-band raster).
- Read in the prominence raster layer from “prominence.tif”; it is also a single-band raster.
- Plot the prominence object. Do the legend units make sense?
- Specify function f, where you look for elevation values over 400 msl and below 650m (mounds don’t appear above that elevation) and prominence values over 60%
- Call
overlay()onelevationandprominence. Set thefunargument tof. - If you have not cropped the two rasters to the same extent yet, you
can do so inside the
overlay()function
# Check in on the elevation and read in prominence layer
elevationclass : RasterLayer
dimensions : 2484, 2592, 6438528 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 313183.7, 390943.7, 4676606, 4751126 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=35 +datum=WGS84 +units=m +no_defs
source : Aster.tif
names : Aster
values : -9999, 2347 (min, max)
prominence <- raster("data/prominence1500m.tif")
# Plot prominence
plot(prominence)# Function f with 2 arguments and the raster math code
f <- function(rast1, rast2) {
rast1 > 400 & rast1 < 650 & rast2 > 60
}
# Align the extent of the two rasters with crop()
cropping <- raster::crop(elevation, prominence)
# Do the overlay using f as fun.
elevation_prom_overlay <- overlay(cropping, prominence, fun = f)
# Plot the result (low elevation and high prominence areas)
plot(elevation_prom_overlay)Congratulations! You’ve now learned to perform raster math using the
raster function overlay(). You limited to areas with >
400m & <650m elevation and > 60% prominence, these areas
should harbor the most prominent mounds (or defense-ready areas if
associated with settlements) in Kazanlak.
Questions:
3. What is the actual value range in the
prom_el_overlay raster?
elevation_prom_overlayclass : RasterLayer
dimensions : 1201, 1201, 1442401 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 340153.7, 376183.7, 4700126, 4736156 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=35 +datum=WGS84 +units=m +no_defs
source : memory
names : layer
values : 0, 1 (min, max)
4. What area is covered by each cell?
Task 7 - HOMEWORK: Inspect the most prominent mounds
In the above exercise, we produced an elevation-prominence overlay. Mounds and other sites that sit in this overlay enjoy a strategic position vis-a-vis the rest of the valley. Which ones are they, however, and what are their real prominence values? It is hard to see at the scale of the valley and it would be good to pull the sites out.
Find the mounds that enjoy the most prominent locations as well as those that feature in the elevation-prominence overlay raster. Produce a list of the ID numbers (TRAP_Code) of all the overlay mounds and 25 most prominent mounds and plot them (expressing their prominence somehow) .
Instructions
Find or recreate the
moundssf object,prominenceandprom-el-overlayrasters.Plot the
prom-el-overlayand themoundson top of each other to check visual overlap. Do the same withprominenceraster andmounds.Extract values from the elevation-prominence overlay raster and from the prominence raster at mounds and write them to two columns:
– call the first column
prom_el_overlay– name the second columnprominenceMake an object of mounds that sit within these strategic high-visibility locations. How many are there?
Make an object of 25 mounds with the highest prominence values. Which
TRAP_Codeids are included?Plot both these sets of mounds in
mapview()and compare their locations.
#Find or recreate the mounds sf object, prominence and prom-el-overlay rasters.
# prominence
# mounds
# elevation_prom_overlay
#Plot the prom-el-overlay and the mounds on top of each other to check visual overlap.
plot(elevation_prom_overlay)
plot(st_geometry(mounds), add = TRUE)#Do the same with prominence raster and mounds.
plot(prominence)
plot(st_geometry(mounds), add = TRUE)#Extract values from the elevation-prominence overlay raster and from the prominence raster at mounds and write them to two columns:
mounds$prom_el_overlay <- raster::extract(elevation_prom_overlay, mounds) #prom_el_overlay column
mounds$prominence <- raster::extract(prominence, mounds) #prominence column
#Make an object of mounds that sit within these strategic high-visibility locations.
mounds_visibility <- raster::crop(elevation_prom_overlay, mounds)
#How many are there?
nrow(mounds_visibility)[1] 590
#Make an object of 25 mounds with the highest prominence values. Which TRAP_Code ids are included?
library(tidyverse)
mounds_highest_prom <- mounds %>%
dplyr::arrange(desc(prominence)) %>% #descending order
dplyr::slice(1:25)
#Plot both these sets of mounds in mapview() and compare their locations.
#install.packages("mapview") #installing package to make map
library(mapview)
mapview(mounds_visibility) +
mapview(mounds_highest_prom)Question:
5. How do the mounds with high prom_el_overlay values differ from those with high prominence?
elevation_prom_overlayclass : RasterLayer
dimensions : 1201, 1201, 1442401 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : 340153.7, 376183.7, 4700126, 4736156 (xmin, xmax, ymin, ymax)
crs : +proj=utm +zone=35 +datum=WGS84 +units=m +no_defs
source : memory
names : layer
values : 0, 1 (min, max)
mounds_highest_promSimple feature collection with 25 features and 8 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 353119.6 ymin: 4713055 xmax: 365234.8 ymax: 4724245
Projected CRS: WGS 84 / UTM zone 35N
First 10 features:
Notes TRAP_Code Cluster Lat Long elev prom_el_overlay prominence
1 Mound 2002 3 42.62582 25.32606 405 1 96.57824
2 Mound 3112 2 42.65666 25.20794 459 1 96.42445
3 Mound 3105 2 42.65654 25.21010 456 1 96.15533
4 Mound 4057 3 42.58800 25.31027 441 1 95.50173
5 Mound 4058 3 42.58772 25.31029 442 1 94.23299
6 Mound 3722 3 42.59022 25.26064 458 1 94.00230
7 Mound 4059 3 42.58756 25.31044 441 1 93.96386
8 Mound 3417 3 42.59130 25.26528 454 1 93.84852
geometry
1 POINT (362733.2 4720622)
2 POINT (353119.6 4724245)
3 POINT (353295.9 4724228)
4 POINT (361354.8 4716448)
5 POINT (361355.8 4716418)
6 POINT (357287.2 4716777)
7 POINT (361367.3 4716400)
8 POINT (357670.3 4716889)
[ reached 'max' / getOption("max.print") -- omitted 2 rows ]
Task 8 - OPTIONAL HOMEWORK: Sum raster values across polygons - assessing prominence of past settlements
Previously you have extracted values from computed rasters at points. More often, you will need to extract values using more complex vector shapes where you will need to decide how to summarize the raster values that fall in. In this exercise, you will load the layer of ancient settlements for the Kazanlak Valley and sum the overlapping elevation_prom_overlay values within the settlement polygons. Then you will find out which of the sites contained at least 0.5 ha of strategic high-visibility area.
Instructions
- Create a
sitessf object by reading in polygons from “KAZ_nuclei.shp” - Plot the elevation-prominence overlay and the sites on top of each other
- Use
maskfunction to create asites_maskraster, where raster cells under sites’ polygons retain their original values, while cells outside the polygons are turned toNA. - Use
cropfunction to cropsites_maskraster to the extent of the sites’ polygons. - Plot the result ( I recommend
mapview()) - Produce a list of sites that have at least 0.5 ha (ie. 6 grid cells)
within these strategic high-visibility locations. (Hint:you can first
extract a list of overlapping sites and then use
sapply()to filter out the ones > 6)
# Read in the site polygons
sites <- ______("data/KAZ_nuclei.shp")
# Plot the sites
plot(elevation_prom_overlay);plot(__________)
# Create a raster mask from the elevation_prom_overlay using the sites
sites_mask <- _____________
# Crop the raster mask to the sites
sites_strat <- _____________
# Plot and compare the results
plot(sites_mask)
plot(sites_strat)
# Produce a list of the most prominent sitesLovely work! Manipulating rasters and mastering raster-vector interactions is the bread and butter of spatial data wrangling :)